Interpolation: Summary & Update

2023-12-06

Inverse Distance Weighting

  • Spatial interpolation method that calculates cell values (\(V\)) as a weighted average of sample points.

  • The weights are inversely proportional to the distance between the predicted and sampled locations.

    • This means points further away have less influence on the predicted value.
  • The predicted value \(V\) is calculated as \[ V = \frac{\sum_{i=1}^n v_i \frac{1}{d_i^p}} {\sum_{i=1}^n \frac{1}{d_i^p}} \] where \(d_i\) is the distance between the prediction location and measurement point \(i\), \(v_i\) is the value measured at that point, \(n\) is the number of measurement points used, and \(p\) is the power parameter.

  • Well-known method

  • Considered among the most “simple” spatial interpolation methods

  • Output is sensitive to clustering

  • Does not provide standard errors

  • Typically use training/test datasets to evaluate “goodness of fit”
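
A minimal Python sketch of the formula above, plus a root-mean-square error against held-out test points as one possible “goodness of fit” measure. The function names (`idw_predict`, `rmse`) and the coordinates and values are made up for illustration; they are not from the ArcGIS Pro or R material.

```python
import math

def idw_predict(samples, target, power=2.0):
    """Predict the value at `target` from (x, y, value) samples using IDW."""
    numerator = 0.0
    denominator = 0.0
    for x, y, value in samples:
        d = math.hypot(x - target[0], y - target[1])
        if d == 0.0:
            # The target coincides with a sample: return the measured value
            return value
        weight = 1.0 / d ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator

def rmse(train, test, power=2.0):
    """Root-mean-square error of IDW predictions at held-out test points."""
    squared_errors = [
        (idw_predict(train, (x, y), power) - value) ** 2
        for x, y, value in test
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Made-up training and test points: (x, y, measured value)
train = [(0.0, 0.0, 10.0), (2.0, 0.0, 20.0), (0.0, 2.0, 30.0)]
test = [(1.0, 0.0, 14.0)]

prediction = idw_predict(train, (1.0, 0.0))
error = rmse(train, test)
```

The two training points at distance 1 dominate the prediction; the third point, further away, contributes a much smaller weight.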

Important parameters

Modified from ArcGIS Pro

Power Parameter

  • \(p\) controls the significance of the known points based on their distance from the predicted point.
  • As \(p\) increases, the interpolated values approach the value of the nearest sample point.
  • The default value is typically 2, although there is no mathematical or practical reason for this choice.
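
A small worked example of the effect of \(p\), using made-up distances. Writing the weights in normalized form (so they sum to 1):

\[ w_i = \frac{d_i^{-p}}{\sum_{j=1}^n d_j^{-p}}, \qquad V = \sum_{i=1}^n w_i v_i \]

For two sample points at \(d_1 = 1\) and \(d_2 = 2\), the weight on the nearer point is

\[ w_1 = \frac{1}{1 + 2^{-p}} \]

which gives \(w_1 = 0.5\) for \(p = 0\), \(w_1 \approx 0.667\) for \(p = 1\), \(w_1 = 0.8\) for \(p = 2\), and \(w_1 \approx 0.941\) for \(p = 4\). As \(p \to 0\) every point gets weight \(1/n\) (a simple mean); as \(p \to \infty\) all the weight goes to the nearest point.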

Neighborhood

  • How far and where to look for measured values used to make the prediction.
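
One common way to define a neighborhood is to keep only the closest few points, optionally within a maximum search radius. The sketch below assumes that definition; `select_neighbors`, its parameters, and the sample points are hypothetical and do not correspond to a specific ArcGIS Pro or R setting.

```python
import math

def select_neighbors(samples, target, max_neighbors=3, max_radius=None):
    """Return the closest samples to `target`, nearest first."""
    with_distance = [
        (math.hypot(x - target[0], y - target[1]), (x, y, value))
        for x, y, value in samples
    ]
    if max_radius is not None:
        # Drop samples that fall outside the search radius
        with_distance = [pair for pair in with_distance if pair[0] <= max_radius]
    with_distance.sort(key=lambda pair: pair[0])
    return [sample for _, sample in with_distance[:max_neighbors]]

# Made-up samples: (x, y, measured value)
samples = [(0.0, 0.0, 10.0), (1.0, 0.0, 12.0), (5.0, 5.0, 40.0), (0.0, 3.0, 18.0)]
nearest_two = select_neighbors(samples, (0.5, 0.0), max_neighbors=2)
within_radius = select_neighbors(samples, (0.5, 0.0), max_neighbors=10, max_radius=4.0)
```

Only the selected samples would then be passed to the weighted average, so distant points are excluded entirely rather than just down-weighted.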

Example

Modified from Interpolation in R

Precipitation data from Texas:

[21 stations]

[inverse distance weighted interpolation]

[inverse distance weighted interpolation]

Superchill in NS

Map of Nova Scotia with “measured” superchill likelihood. (Gold buffer is 2 km from the coast)

We want to interpolate AROUND the province, not THROUGH it!!

Inverse Path Distance Weighting

  • Adds a “cost raster” to the distance calculation.
  • The cost of going through the land is so high that paths between points go around it instead.
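
To illustrate the idea, here is a toy least-cost distance calculation on a small cost raster using Dijkstra's algorithm. This is a sketch under assumptions, not the method of any particular package: the cost values (`WATER`, `LAND`), the 3 x 3 raster, the 4-neighbour moves, and the function name `path_distances` are all made up.

```python
import heapq

def path_distances(cost, start):
    """Least-cost distance from `start` to every cell of a cost raster.

    `cost` is a list of rows; moving into a cell costs that cell's value.
    Only 4-neighbour moves (up, down, left, right) are allowed here.
    """
    n_rows, n_cols = len(cost), len(cost[0])
    best = [[float("inf")] * n_cols for _ in range(n_rows)]
    best[start[0]][start[1]] = 0.0
    queue = [(0.0, start)]
    while queue:
        dist, (row, col) = heapq.heappop(queue)
        if dist > best[row][col]:
            continue  # stale queue entry, a shorter route was already found
        for d_row, d_col in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            r, c = row + d_row, col + d_col
            if 0 <= r < n_rows and 0 <= c < n_cols:
                new_dist = dist + cost[r][c]
                if new_dist < best[r][c]:
                    best[r][c] = new_dist
                    heapq.heappush(queue, (new_dist, (r, c)))
    return best

WATER, LAND = 1.0, 10000.0  # made-up costs
# Toy raster: a strip of land separates the left and right columns,
# with open water along the bottom row.
cost = [
    [WATER, LAND, WATER],
    [WATER, LAND, WATER],
    [WATER, WATER, WATER],
]
distances = path_distances(cost, start=(0, 0))
around = distances[0][2]  # top-left to top-right, going around the land
```

The top-left and top-right cells are only two cells apart in a straight line, but the least-cost route follows the water around the land strip. Using this path distance in place of the straight-line \(d_i\) is what would keep points on opposite sides of the land from influencing each other strongly.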

Cost Raster

Training points (black) and test points (red).

Note that only one point is selected per grid box (so some stations are left out).
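
A sketch of one way to select a single point per grid box. The rule used here (keep the first station encountered in each box) is an assumption, as are the function name `split_by_grid`, the cell size, and the station coordinates; the actual selection rule may differ.

```python
import math

def split_by_grid(stations, cell_size):
    """Keep the first station in each grid box; return (kept, left_out)."""
    occupied = set()
    kept, left_out = [], []
    for x, y, value in stations:
        # Index of the grid box containing this station
        box = (math.floor(x / cell_size), math.floor(y / cell_size))
        if box in occupied:
            left_out.append((x, y, value))
        else:
            occupied.add(box)
            kept.append((x, y, value))
    return kept, left_out

# Made-up stations: (x, y, measured value)
stations = [(0.2, 0.3, 5.0), (0.8, 0.1, 6.0), (1.5, 0.4, 7.0), (1.1, 1.9, 8.0)]
kept, left_out = split_by_grid(stations, cell_size=1.0)
```

Here the first two stations share a grid box, so the second one is left out. Thinning like this reduces the influence of tightly clustered stations, at the cost of discarding data.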

Interpolation

With training dataset:

To Do

  • Better understand the paths created.
  • Determine whether all stations can be used, or whether choosing 1 station per grid box is needed to address the clustering issue.
  • Set up cross-validation (n-fold or k-fold).
  • Compare to regular IDW.
  • Figure out mapping + colour scale.
  • Repeat for heat stress.

Questions

  • What resolution are the other layers?
  • What final output is required?